High-Accuracy Identification of Incident HIV-1 Infections Using a Sequence Clustering Based Diversity Measure
نویسندگان
چکیده
Accurate estimates of HIV-1 incidence are essential for monitoring epidemic trends and evaluating intervention efforts. However, the long asymptomatic stage of HIV-1 infection makes it difficult to effectively distinguish incident infections from chronic ones. Current incidence assays based on serology or viral sequence diversity are both still lacking in accuracy. In the present work, a sequence clustering based diversity (SCBD) assay was devised by utilizing the fact that viral sequences derived from each transmitted/founder (T/F) strain tend to cluster together at early stage, and that only the intra-cluster diversity is correlated with the time since HIV-1 infection. The dot-matrix pairwise alignment was used to eliminate the disproportional impact of insertion/deletions (indels) and recombination events, and so was the proportion of clusterable sequences (Pc) as an index to identify late chronic infections with declined viral genetic diversity. Tested on a dataset containing 398 incident and 163 chronic infection cases collected from the Los Alamos HIV database (last modified 2/8/2012), our SCBD method achieved 99.5% sensitivity and 98.8% specificity, with an overall accuracy of 99.3%. Further analysis and evaluation also suggested its performance was not affected by host factors such as the viral subtypes and transmission routes. The SCBD method demonstrated the potential of sequencing based techniques to become useful for identifying incident infections. Its use may be most advantageous for settings with low to moderate incidence relative to available resources. The online service is available at http://www.bioinfo.tsinghua.edu.cn:8080/SCBD/index.jsp.
منابع مشابه
Fingerprinting and genetic diversity evaluation of rice cultivars using Inter Simple Sequence Repeat marker
Rice as one of the most important agricultural crops has a putative potential for ensuring food security and addressing poverty in the world. In the present study, in order to provide basic information to improve rice through breeding programs, Inter Simple Sequence Repeat marker (ISSR) was used For DNA fingerprinting and finding genetic relationships among 32 different cultivars. In this study...
متن کاملA Generalized Entropy Measure of Within-Host Viral Diversity for Identifying Recent HIV-1 Infections
There is a need for incidence assays that accurately estimate HIV incidence based on cross-sectional specimens. Viral diversity-based assays have shown promises but are not particularly accurate. We hypothesize that certain viral genetic regions are more predictive of recent infection than others and aim to improve assay accuracy by using classification algorithms that focus on highly informati...
متن کاملThe ensemble clustering with maximize diversity using evolutionary optimization algorithms
Data clustering is one of the main steps in data mining, which is responsible for exploring hidden patterns in non-tagged data. Due to the complexity of the problem and the weakness of the basic clustering methods, most studies today are guided by clustering ensemble methods. Diversity in primary results is one of the most important factors that can affect the quality of the final results. Also...
متن کاملHIV-1 envelope sequence-based diversity measures for identifying recent infections
Identifying recent HIV-1 infections is crucial for monitoring HIV-1 incidence and optimizing public health prevention efforts. To identify recent HIV-1 infections, we evaluated and compared the performance of 4 sequence-based diversity measures including percent diversity, percent complexity, Shannon entropy and number of haplotypes targeting 13 genetic segments within the env gene of HIV-1. A ...
متن کاملGenetic Diversity of Marrubium Species from Zagros Region (Iran), Using Inter Simple Sequence Repeat Molecular Marker
This study concerns the genetic diversity and taxonomic status of Marrubium species from central and south-west of Zagros region, Iran. It is investigated by Inter-Simple Sequence Repeat analysis. A total of 68 accessions from five Marrubium species were collected from their natural habitats. Molecular analysis was approved with 17 primers, of which 12 were carried out in the reaction mixture. ...
متن کامل